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Abstract 


,  We  examine  iterative  methods  for  solving  sparse  nonsymmetric  indefinite  systems  of  linear 
equations.  Methods  considered  include  a  new  adaptive  method  based  on  polynomials  that  satisfy 
an  optimality  condition  in  the  Chebyshev  norm,  the  conjugate  gradient-like  method  GMRES,  and 
the  conjugate  gradient  method  applied  to  the  normal  equations.  Numerical  experiments  on  several 
non-self-adjoint  indefinite  elliptic  boundary  value  problems  suggest  that  none  of  these  methods  is 
dramatically  superior  to  the  others.  Their  performance  in  solving  moderately  difficult  problems  is 
satisfactory,  but  for  harder  problems  their  convergence  is  slow.  ^ 

1.  Introduction 

In  recent  years  there  has  been  significant  progress  in  the  development  of  iterative  methods  for 
solving  sparse  real  linear  systems  of  the  form 


Au  =  b,  (1.1) 

where  A  is  a  nonsymmetric  matrix  of  order  N.  One  key  to  this  progress  has  been  the  derivation  of 
polynomial  based  methods,  i.e.  methods  whose  m-th  approximate  solution  iterate  has  the  form 

Um  =  no  -r  9m-i(A)ro,  (1.2) 

where  uo  is  an  initial  guess  for  the  solution,  ro  =  b  —  Ait o,  and  qm-i  is  a  real  polynomial  of  degree 
m  —  1.  The  residual  rm=b  —  Aum  satisfies 


rm  =  [I-  Aqm-i  (A)]r0  =  pm(A)r0, 


(1.3) 


where  pm  is  a  real  polynomial  of  degree  m  such  that  pm(0)  =  1.  Applying  any  norm  to  (1.3)  gives 

lkm||  <  ||Pm(A)||||r0||. 

Moreover,  if  A  is  diagonalizable  as  A  =  U\U~l,  then 

i|Pm(A)||  =  ||{7pm(A)f7-l||  <  IIW1!!  max>m(A)|, 

so  that 


Ag  <t(A) 


rm|!  <  ||tf||||tf-1||  max  lpm(A)|  ||r0||. 


(1.4) 


A  eo(A) 

Thus  any  polynomial  pm  that  is  sufficiently  small  on  the  eigenvalues  of  A  is  a  good  candidate  for 
generating  an  iterative  method. 


The  conjugate  gradient  and  Chebyshev  methods  are  well-known  polynomial-based  methods 
for  solving  symmetric  positive-definite  systems  for  which  the  residual  polynomials  {pm}  have  desir¬ 
able  optimality  properties  [8].  Generalizations  of  these  techniques  have  been  developed  for  solving 
both  symmetric  indefinite  systems  (see  e.g.  [3,  4,  17,  18]),  and  nonsymmetric  systems  with  definite 
symmetric  part  (A  +  AT)/ 2  (see  e.g.  [5,  8,  14]  and  references  therein).  In  the  latter  case,  all  of 
the  eigenvalues  of  A  lie  in  either  the  right  half  or  the  left  half  of  the  complex  plane.  Sparse  linear 
systems  that  both  are  nonsymmetric  and  have  indefinite  symmetric  part  arise  in  numerous  settings. 
Examples  include  the  discretization  of  the  Helmholtz  equations  for  modelling  acoustic  phenomena 
[l]  and  the  discretization  of  the  coupled  partial  differential  equations  arising  in  numerical  semi¬ 
conductor  device  simulation  [12].  Gradient  methods  that  have  been  proposed  as  solvers  for  such 
problems  include  the  conjugate  gradient  method  applied  to  the  normal  equations  (CGN)  [9],  the  bi¬ 
conjugate  gradient  method  [7],  the  restarted  generalized  minimum  residual  method  (GMRES)  [20], 
and  new  methods  presented  in  [11,  26].  Smolarski  and  Saylor  [22]  and  Saad  [19]  have  proposed 
adaptive  polynomial  iteration  methods  of  the  form  (1.2)  using  polynomials  that  are  optimal  with 
respect  a  weighted  least  squares  norm.  In  this  paper,  we  introduce  a  polynomial-based  method, 
PSUP,  that  computes  a  polynomial  that  is  nearly  optimal  with  respect  to  the  Chebyshev  norm  on 
a  region  containing  the  eigenvalue  estimates  and  then  uses  this  polynomial  in  (1.2).  We  compare 
its  performance  with  the  two  gradient  methods  CGN  and  GMRES. 

In  Section  2,  we  give  a  brief  description  of  the  gradient  methods  CGN  and  GMRES.  In  Section 
3,  we  describe  the  new  PSUP  method  and  several  heuristics  developed  to  improve  its  performance. 
In  Section  4,  we  describe  numerical  experiments  in  which  these  three  methods  are  used  to  solve 
some  non-self-adjoint  indefinite  elliptic  problems,  and  in  Section  5  we  draw  conclusions  based  on 
the  numerical  tests. 

2.  Gradient  Methods 

In  this  section  we  briefly  review  two  conjugate  gradient-like  methods  for  solving  nonsymmetric 
indefinite  systems.  The  conjugate  gradient  method  [9]  is  applicable  only  to  symmetric  positive 
definite  linear  systems.  For  nonsymmetric  systems,  it  can  be  used  to  solve  the  normal  equations 
AT Ax  =  ATb.  The  scaled  residuals  {A^r*,}  satisfy 

ATrm  =  pm(ArA)Arr0, 

where  pm  is  the  unique  polynomial  of  degree  m  such  that  pm(0)  =  1  and  ||rm||2  is  minimum.  .As 
is  well  known,  the  condition  number  of  AT A  is  the  square  of  that  of  A.  Moreover,  the  standard 
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implementation  of  CGN  requires  two  matrix-vector  products  at  each  iteration,  one  by  A  and  one  by 
Ar,  plus  5N  additional  operations.  The  storage  requirement  is  4N  words.  The  dependence  of  CGN 
on  AtA  has  led  to  efforts  to  find  alternatives  that  are  more  rapidly  convergent  and  less  expensive 
per  step.  For  nonsymmetric  systems  with  positive  definite  symmetric  part,  several  methods  have 
been  shown  to  be  superior  to  CGN  [5]. 

GMRES  is  a  method  proposed  for  solving  nonsymmetric  indefinite  systems  that  avoids  the 
use  of  the  normal  equations  [20].  Given  an  initial  guess,  uo,  for  the  solution,  with  residual  ro,  this 
method  generates  an  orthogonal  basis  {iq, . . . ,vm}  for  the  Krylov  space 

Km  =  span{r0,Aro,...,Am_1ro} 

using  Arnoldi’s  method.  Let  v\  =  ro/|| r’olja •  The  Arnoldi  process  computes  for  j  =  1,. . .  ,m 

hij  =  (Avj,vi),  i  = 

; 

vJ+l  =  Avj  -  ^2  h{j  V{, 

■=i 

hj+iJ  —  Il%'+ill2, 

vi+ 1  =  Vj+x/hj+ij. 

GMRES  then  computes  an  approximate  solution 

m 

«m  =  uo  +  ^ayuy,  (2.1) 

>=1 

where  the  scalars  {ofy  }yL=i  are  chosen  so  that  ||rm||2  is  minimum.  These  scalars  can  be  computed 
by  solving  the  upper  Hessenberg  least  squares  problem 

min  UiMUei  -  Hmo|  , 

where  e\  =  (1,0, . . .  ,0)T  €  Rm+1  and  Hm  is  the  Hessenberg  matrix  of  size  (m  +  1)  x  m  whose 
(i,j)-entry  is  h,;-  [20].  By  the  choice  of  basis  and  the  minimization  property,  rm  =  pm(A)ro  where 
pm  is  the  real  polynomial  of  degree  m  such  that  pm(0)  =  1  and  pm  is  optimal  with  respect  to  the 
residual  norm  ||rm|]2  (c.f.  [8]  for  other  formulations  of  this  optimal  iteration). 

In  a  practical  implementation,  the  dimension  m  of  the  Krylov  space  is  fixed,  and  the  GMRES 
iteration  is  restarted  with  um  in  place  of  «o-  This  is  the  GMRES(m)  method.  Defining  one  “step" 
to  be  the  average  of  the  m-fold  iteration  divided  by  m,  the  cost  per  step  is  (m  +  3  +  1  /m)N 
operations  plus  one  matrix-vector  product.  It  requires  (m  +  2)  jV  words  of  storage. 
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We  remark  that  the  Arnoldi  process  was  originally  developed  as  a  technique  for  computing 
eigenvalues  [27].  Let  Vm  denote  the  matrix  whose  columns  are  the  m  vectors  generated  by  the 
Arnoldi  step  in  GMRES(m),  and  let  Hm  denote  the  square  upper  Hessenberg  matrix  consisting  of 
the  first  m  rows  of  Hm.  Then  Vm  is  an  orthonormal  matrix  of  order  N  x  m  that  satisfies 

v£AVm  =  Hm.  (2.2) 

Relation  (2.2)  resembles  a  similarity  transformation,  and  Arnoldi’s  method  consists  of  using  the 
eigenvalues  of  Hm  as  estimates  for  (some  of)  the  eigenvalues  of  A.  Suppose  A  =  U \U~l  for  diagonal 
A  and  r0  is  dominated  by  m  eigenvectors  {uj}™=l,  with  corresponding  eigenvalues  {AyJJ-J^.  Then 
the  residual  after  m  GMRES  steps  satisfies  [6] 

Ikmlh  <  IlCW'lh  cm  ||e||2 

where 

m 

= Kp  i* -a/i/m 

and  e  is  orthogonal  to  {tiyjylp  Loosely  speaking,  GMRES(m)  damps  out  from  the  residual  the 
eigenvectors  whose  eigenvalues  are  computed  by  Arnoldi’s  method. 

3.  The  PSUP  Method 

The  gradient  methods  just  described  compute  iterates  and  residuals  that  satisfy  (1.2)  and  (1.3) 
(for  CGN,  with  respect  to  ATA)  in  which  the  polynomials  are  built  up  recursively  without  explicit 
computation  of  their  coefficients.  In  this  section,  we  describe  an  alternative  iteration  that  computes 
explicitly  the  coefficients  of  a  polynomial  qm-i[z)  for  which  pm(z)  =  1  -  zqm-i(z)  is  small  on  the 
spectrum  <r(A).  In  the  following,  we  will  refer  to  the  polynomial  qm-i{z)  of  (1.2)  as  the  “iteration 
polynomial”  and  to  the  polynomial  pm(z)  =  1  -  zqm-\{z)  of  (1.3)  as  the  “residual  polynomial.” 

Suppose  a  compact  region  D  C  C  contains  <r(A).  Let  pm  be  a  polynomial  of  degree  m  that 
satisfies 

Pm{ 0)  =  1,  llPmll  =  max  |pm(z)|  =  £  <  1. 

As  is  evident  from  (1.4),  an  iteration  having  pm  as  its  residual  polynomial  will  result  in  a  decrease 
of  the  residual  norm  if  £  is  small  enough.  The  best  possible  iteration  polynomial  with  respect  to 
this  norm  (the  Chebyshev  norm)  is  the  solution  to  the  minimax  problem 

£  =  min  max 1 1  -  zqm-i(z)\. 

?m-  I  z€D 


(3.1) 
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Let  gm_i(2)  =  YyjLo  aj2* •  The  solution  to  (3.1)  is  also  the  Chebyshev  solution  to  the  infinite 
system  of  equations 

m—  1 

zj+l*j  =  1,  26  dD.  (3.2) 

>=o 

Only  the  boundary  dD  need  be  considered  because  of  the  maximum  modulus  principle. 

The  PSUP  method  uses  an  iteration  polynomial  obtained  from  an  approximate  solution  to 
(3.1).  We  briefly  summarize  the  technique  used;  details  can  be  found  in  [24].  First,  (3.2)  is 
replaced  by  a  finite  dimensional  problem 


m—  1 

y,  zj+1aj  =  1,  zE  d Dm, 
j= o 


(3.3) 


where  dD\{  is  a  finite  subset  of  dD  containing  M  points,  M  >  m.  Equation  (3.3)  is  an  overdeter¬ 
mined  system  of  M  equations  in  the  m  unknowns  {a/Jyljo1.  The  Chebyshev  problem  for  (3.3)  is 
given  by 


m— 1 

min  max  y  2J+1a,-  -  1 

{ay}  zedDM\ 


(3.4) 


Second,  equation  (3.4)  is  solved  approximately  using  a  semi-infinite  linear  programming  approach 
to  complex  approximation,  which  is  based  on  the  identity  |ti/|  =  maxo<«<2>r  Re(we~,e),  w  eC.  Let 
0  =  {0i,..., 0P}  C  [0, 27t),  and  define  the  discretized  absolute  value 


Me  =  max  Re(we  ,s). 


Consider  the  discretized  problem 


min  max 

{ay}  zed  Dm 


m— 1 


y  2;+1a;-  -  1 


j=o 


(3.5) 


where  the  absolute  value  in  (3.4)  is  replaced  by  the  discretized  absolute  value.  This  gives  rise  to  a 
linear  program  for  {ay}™.^1.  Let  e*  denote  the  minimax  value  of  |  H/LV  **+laj  —  1|  at  the  solution 
to  (3.4),  and  let  «*  denote  the  minimax  value  for  (3.5).  It  can  be  shown  that 


Me  <  M  ^  Me  sec( a/2) 


for  all  wEC,  and  consequently  that 


Ep  <  «*  <  £p  sec(a/2), 
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where  a  is  the  smallest  difference  (mod  2 ir)  between  two  neighboring  angles  in  ©. 
bounds  are  sharpest  for  given  p  when  ©  consists  of  the  p-th  roots  of  unity,  so  that  a 
use  this  choice  of  ©  in  the  following,  with  p  =  256  so  that  sec( ot/2)  =  1.000075. 

The  dual  of  the  LP  (3.5)  can  be  written  in  the  form 

min  f?e[ej/5e-,e] 
se rm*’,  Qe r 

subject  to:  S  >  0,  Q  >  0,  ZTSe~,Q  =  0  €  Cm 

M  p 

and  Q  + 

>=i  *=i 

where  e\f  €  CM  is  the  vector  whose  components  are  all  1,  Z  €  CMxm  is  the  coefficient  matrix  of 
(3.4),  and  e-,e  €  Cp  denotes  the  vector  whose  j- th  component  is  .  Q  is  a  slack  variable  which 
must  be  0  if  e*  >  0.  A  straightforward  application  of  the  simplex  method  to  the  dual  requires 
O(Mmp)  multiplications  per  simplex  iteration  and  O(Mmp)  storage  locations.  In  [24],  it  is  shown 
that  the  factor  p  can  be  eliminated  from  these  estimates  by  exploiting  the  special  structure  of  the 
dual.  These  economies  leave  unaltered  the  sequence  of  basic  feasible  solutions  that  the  simplex 
method  generates  en  route  to  the  solution.  Moreover,  they  simplify  further  if  the  coefficients  (ay } 
are  required  to  be  real.  In  practice  the  number  of  simplex  iterations  has  been  observed  to  be  0(m) 
so  that  the  computational  effort  to  compute  {a;}  using  the  algorithm  in  [23]  is  0{Mm2).  In  the 
experiments  discussed  below,  both  M  and  m  are  significantly  smaller  than  the  order  N  of  the  linear 
system  so  that  construction  of  the  coefficients  of  the  iteration  polynomial  is  a  low  order  cost  of  the 
solution  process. 

Given  uo  and  ro,  the  basic  PSUP  iteration  consists  of  repeated  application  of  the  iteration 
polynomial  qm-i<  as  follows: 

Algorithm  1:  The  PSUP  iteration. 

For  k  =  1,2, ...  Do 

ukm  =  w(fc-l)m  "b  ?m-l  (■^)r(*-l)m 
rkm  =  b  —  AU(k.X)m. 

The  actual  computation  w  *—  gm_i(A)r  is  performed  using  Horner’s  rule: 
w  —  am_ir 
For  j  =  1  to  m  -  1  Do 


The  upper 
=  2ir/p.  We 


v  <—  Aw 

w  «-  am_t_yr  +  v. 
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The  m-fold  PSUP  iteration  requires  m  matrix-vector  products  and  m  scalar  vector  products, 
so  that  the  “average”  cost  is  one  matrix-vector  product  and  one  scalar-vector  product.  PSUP 
requires  4 N  storage,  for  u,  r,  v  and  w. 

In  practice,  the  PSUP  iteration  needs  estimates  of  the  eigenvalues  of  A  in  order  to  obtain  the 
set  D.  Several  adaptive  techniques  have  been  developed  for  combining  an  eigenvalue  estimation 
procedure  with  polynomial  iteration  [6,  13,  19].  We  will  use  the  hybrid  technique  developed  in  [6, 
19],  which  uses  Arnoldi’s  method  for  eigenvalue  estimates. 

First,  the  Arnoldi  process  is  used  to  compute  some  number  £,•  of  eigenvalue  estimates  prior  to 
execution  of  the  PSUP  iteration.  Given  these  estimates,  a  set  D  is  constructed  that  contains  them, 
from  which  the  PSUP  iteration  polynomial  qm~i  is  computed.  (We  discuss  our  choice  for  D  below.) 
One  possible  strategy  is  to  perform  the  PSUP  iteration  with  <?m-i  until  the  iteration  converges. 
However,  there  is  no  guarantee  that  all  the  extreme  eigenvalues  of  A  are  computed  by  the  Arnoldi 
procedure.  The  set  D  is  contained  in  the  lemniscate  region  [10]  Lm  —  {z  €  C  |  |pm(2)|  <  e},  where 
€  and  pm  =  1  —  zqm-i(z )  solve  (3.1).  Moreover,  the  modulus  of  pm  is  greater  than  £  outside  Lm 
and  tends  to  grow  rapidly  outside  Lm,  at  least  in  some  directions.  If  an  eigenvalue  A  lies  outside 
Lm  and  |pm(A)|  is  large  enough,  then  the  PSUP  method  will  diverge. 

One  way  to  avoid  this  behavior  is  to  invoke  the  adaptive  procedure:  if  PSUP  diverges  then  ka 
additional  Arnoldi  steps  are  performed  to  compute  ka  new  eigenvalue  estimates.  These  estimates 
are  then  used  to  construct  a  new  enclosing  set  D  and  a  new  iteration  polynomial  gm_i,  with  which 
the  PSUP  iteration  is  resumed.  A  good  choice  for  a  starting  vector  v\  is  the  last  residual  from  the 
previous  PSUP  iteration  (normalized  to  have  unit  norm).  For  if  PSUP  diverges,  then  the  residual 
will  tend  to  be  dominated  by  the  eigenvectors  whose  eigenvalues  are  not  being  damped  out  by  the 
PSUP  polynomial.  Moreover,  this  technique  can  be  improved  using  GMRES.  Once  the  ka  Arnoldi 
vectors  are  available,  the  GMRES(&0)  iteration  (2.1)  can  be  performed  at  relatively  little  extra 
expense.  This  has  the  effect  of  damping  out  from  the  residual  the  eigenvector  components  that 
were  being  enhanced  by  the  previous  PSUP  iteration. 

Rather  than  use  the  PSUP  iteration  alone,  we  consider  a  hybrid  PSUP-GMRES  method  that 
makes  use  of  these  observations.  This  method  consists  of  repeated  iteration  of  some  number  s  of 
PSUP  steps,  followed  by  a  smaller  number  ka  of  Arnoldi-GMRES  steps.  The  initial  eigenvalue 
estimates  are  provided  by  fc,  Arnoldi-GMRES  steps,  where  may  differ  from  ka.  In  addition,  the 
adaptive  procedure  is  invoked  immediately  if  the  residual  norm  of  the  PSUP  iteration  increases 
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by  some  tolerance  r  relative  to  the  smallest  residual  previously  encountered.  The  following  is  a 
modification  of  the  hybrid  method  developed  in  [6]  that  uses  the  PSUP  iteration: 

Algorithm  2\  The  hybrid  GMRES-PSUP  method. 

Choose  uo-  Compute  tq  =  b  —  Auo- 

Until  Convergence  Do 

Adaptive  (Initialization)  Steps:  Set  t>i  =  the  current  normalized  residual, 
perform  ka  (or  &,)  Arnoldi/GMRES  steps,  and  use  the  new  eigenvalue 
estimates  to  update  (or  initialize)  the  PSUP  coefficients. 

PSUP  Steps:  While  (||r>||2/||rmifl||2  <  r)) 

Perform  s  steps  of  the  PSUP  iteration  (Algorithm  1)  to 
update  the  approximate  solution  Uj  and  residual  r;  . 

For  the  enclosing  set  D  we  take  the  union  of  the  four  sets  Dj ,  where  Dj  is  the  convex  hull  of 
the  set  of  eigenvalue  estimates  in  the  j-th  quadrant  of  the  complex  plane.  With  this  choice,  if  the 
extreme  eigenvalues  of  each  quadrant  have  been  computed,  then  all  the  eigenvalues  are  contained 
in  D.  If  all  the  eigenvalue  estimates  in  either  half  plane  are  real,  then  the  part  of  D  containing 
these  estimates  is  taken  to  be  the  line  segment  between  the  leftmost  and  rightmost  estimates  in 
the  half  plane. 

There  is  no  guarantee  that  the  eigenvalue  estimates  computed  by  Arnoldi’s  method  are  accu¬ 
rate.  Moreover,  since  the  PSUP  residual  polynomial  has  the  value  I  at  the  origin,  if  D  contains 
points  with  both  positive  and  negative  real  parts  that  are  near  the  origin,  then  the  Chebyshev  norm 
of  the  residual  polynomial  will  be  very  close  to  1.  (See  Section  4  for  an  example.)  We  consider 
one  heuristic  designed  to  improve  the  performance  of  the  hybrid  PSUP  method  on  problems  with 
eigenvalues  very  near  the  origin:  we  successively  remove  the  points  closest  to  the  origin  from  the 
set  of  eigenvalue  estimates  (and  generate  a  smaller  D)  until  the  norm  of  the  PSUP  polynomial  is 
smaller  than  some  predetermined  value  p,  and  use  that  polynomial  for  the  PSUP  iteration. 

There  are  two  possible  effects  of  this  heuristic.  If  the  deleted  points  are  not  accurate  as 
eigenvalue  estimates,  then  the  resulting  PSUP  iteration  will  be  just  as  robust  and  more  rapidly 
convergent  than  if  the  deleted  points  had  been  included.  On  the  other  hand,  if  the  deleted  points 
are  good  estimates,  then  the  PSUP  polynomial  will  probably  be  large  on  the  deleted  points,  and 
the  iteration  will  not  damp  out  the  residual  in  the  direction  of  the  corresponding  eigenvectors. 
However,  if  the  dimension  of  this  eigenspace  is  small  (say,  2  or  3),  then  the  iteration  should  damp 
out  the  residual  in  all  other  components,  so  that  the  residual  should  be  dominated  by  a  small 
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number  of  components.  In  this  situation,  a  small  number  of  GMRES  steps  should  damp  out  these 
dominant  components.  We  will  refer  to  the  hybrid  PSUP  method  with  this  heuristic  added  as  the 
GMRES/Reduced-PSUP  scheme. 

We  note  that  with  the  methods  of  [24] ,  (3.5)  can  be  also  solved  with  the  constraint 

m—1 

max  >  z1+laj  -  1  <1, 

i= o 

where  E  is  some  finite  set.  In  particular,  if  E  is  the  set  of  deleted  eigenvalue  estimates  in  the 
GMRES/Reduced-PSUP  scheme,  then  the  PSUP  polynomial  on  the  reduced  set  D  can  be  forced 
to  be  bounded  in  modulus  by  one  on  the  deleted  points.  In  experiments  with  this  version  of  the 
GMRES/Reduced-PSUP  iteration,  we  found  its  performance  to  be  essentially  the  same  as  that  of 
the  unconstrained  version  described  above. 


4.  Numerical  Experiments 

In  this  section,  we  compare  the  performance  of  CGN,  GMRES(m),  GMRES/PSUP  and 
GMRES/Reduced-PSUP  in  solving  several  linear  systems  arising  from  a  finite  difference  discretiza¬ 
tion  of  the  differential  equation 


-Au  +  2P\uz  +  2P2uv  -  P3U  =  /,  it  €  G, 


(4.1) 


u  =  g,  u  €  dfl, 

where  Q  is  the  unit  square  {0  <  x, y  <  1},  and  Pi,  P2  and  P3  are  positive  parameters.  We  use 
f  =  g  =  0,  so  that  the  solution  to  (4.1)  is  u  =  0. 

We  discretize  (4.1)  by  finite  differences  on  a  uniform  n  x  n  grid,  using  centered  differences  for 
the  Laplacian  and  the  first  derivatives.  Let  h  =  l/{n+  1).  After  scaling  by  h2,  the  matrix  equation 
has  the  form  (1.1)  in  which  the  typical  equation  for  the  unknown  u,;  &  u(ih,jh)  is 


(4  -  <t)u ij  -  (1  +  i3)u,_i  j  -I-  (-1  +  /3)ui+lJ  -  (1  +  7)«ij-i  +  (-1  +  l)uij+ !  =  h2fjj, 
where  3  =  P\h,  7  =  P2h ,  a  =  P3/12  and  fij  =  f{ih,jh).  The  eigenvalues  of  A  are  given  by  [21] 


4  — <7  +  2  n/1  -  32 


cos - 


sir 

n  +  1 


+  2vT^ 


7 1  cos 


tir 

n+T 


1  <  s,  <  <  n. 


The  eigenvalues  of  the  symmetric  part  are 


,  n  sir  ttr 

4  -  a  +  2cos - -  +  2cos- 


1 


n  -(-  1 


1  <  s,t  <  n. 
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The  leftmost  eigenvalue  of  the  symmetric  part,  corresponding  to  s  =  t  =  n,  is  given  by 

(2tr  2-P3)h2  +  0(h*), 

so  that  for  small  enough  h  the  symmetric  part  is  indefinite  when  P3  >  27t2 . 

Six  test  problems  corresponding  to  six  choices  of  the  parameter  set  {Pi,  Pi,  P3}  are  considered. 
We  use  the  three  values  P3  =  30,  80,  and  250  together  with  each  of  the  pairs  of  values  {Pi  =  1,  P2  = 
2}  and  {Pi  =  25,  Pi  =  50}.  For  all  tests,  n  =  31,  so  that  the  order  N  =  n2  is  969.  For  all  six  test 
problems,  the  coefficient  matrix  A  is  indefinite,  and  the  number  of  negative  eigenvalues  of  (A-t-Ar)/2 
is  increasing  as  P3  grows.  For  the  first  choice  of  the  (Pi,P2)  pair,  A  is  mildly  nonsymmetric  and 
its  eigenvalues  are  real,  and  for  the  second  choice,  A  is  more  highly  nonsymmetric  and  has  complex 
eigenvalues. 

Although  it  is  not  our  intention  here  to  examine  preconditioners  for  indefinite  systems,  pre¬ 
conditioning  has  been  shown  to  be  a  critical  factor  in  the  performance  of  iterative  methods  [3,  5, 
15],  In  our  tests,  we  precondition  (1.1)  by  the  finite  difference  discretization  of  the  Laplacian.  That 
is,  the  iterative  methods  being  considered  are  applied  to  the  preconditioned  problem 

AQ~lx  —  b,  x  =  Q~lx, 

where  Q  is  the  discrete  Laplacian.  (See  [2]  for  an  asymptotic  analysis  of  this  preconditioner  for 
finite  element  discretizations.)  The  preconditioned  matrix-vector  product  then  consists  of  a  pre¬ 
conditioning  solve  of  the  form  Q~xv  and  a  matrix  multiply  of  the  form  Av.  Since  f2  is  a  square 
domain,  the  preconditioning  is  implemented  using  the  block  cyclic  reduction  method  at  a  cost  of 
3n2log2n  operations  [25] .  We  have  confirmed  numerically  that  the  preconditioned  matrix  AQ~l  in 
all  six  problems  has  indefinite  symmetric  part. 

We  use  the  following  parameters  for  the  hybrid  GMRES-PSUP  iteration.  In  an  effort  to  obtain 
the  dominant  and  subdominant  eigenvalues  of  each  quadrant  at  the  outset,  the  initialization  step 
consists  of  eight  GMRES  steps  (&,  =  8)  giving  eight  eigenvalue  estimates.  All  subsequent  calls 
to  the  adaptive  procedure  consist  of  four  GMRES  steps  ( ka  =  4).  For  all  tests  with  PSUP.  we 
use  a  residual  polynomial  of  degree  four  (m  =  4),  and  allow  at  most  s  =  32  PSUP  steps  (or 
eight  successive  applications  of  the  PSUP  polynomial).  The  adaptive  procedure  is  invoked  if  the 
residual  norm  increases  during  a  PSUP  step  (r  =  1),  or  after  s  steps  are  performed.  We  use 
A I  =  100  points  for  the  discretized  enclosing  set  dD\f,  and  allocate  them  so  that  the  number  of 
points  in  each  quadrant  is  approximately  proportional  to  the  circumference  of  the  convex  hull  in 
that  quadrant.  For  subsets  of  D  that  overlap  on  quadrant  boundaries  (e.g.  if  a  line  segment  on  the 
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real  line  is  shared  by  regions  in  the  first  and  fourth  quadrants),  the  shared  boundary  is  discretized 
twice.  For  the  GMRES/Reduced-PSUP  scheme,  in  which  eigenvalue  estimates  closest  to  the  origin 
are  deleted  until  the  minimax  norm  is  less  than  some  tolerance  t),  we  examine  r)  =  . 5  and  .3.  For 
this  scheme,  we  take  ka  to  be  two  plus  the  number  of  eigenvalue  estimates  deleted.  We  use  the 
notation  GMRES-PSUP(m)  (with  m  =  4)  for  the  “unreduced”  scheme,  and  GMRES-PSUP(m,  tj) 
for  the  reduced  version. 

We  examine  GMRES(m)  for  m  =  5  and  m  =  20.  Recall  that  the  latter  version  generates  a 
higher  degree  optimal  polynomial  at  the  expense  of  a  larger  average  cost  per  step. 

All  numerical  tests  were  run  on  a  VAX  11-780  in  double  precision  (55  bit  mantissa).  The 
initial  guess  in  all  runs  was  a  vector  uo  of  random  numbers  between  -1  and  1.  Figures  1  -  6  show 
the  performance  of  the  methods  measured  in  terms  of  multiplication  counts,  for  the  six  problems 
(also  numbered  1-6).  Note  that  the  horizontal  scale  of  Figure  1  is  wider  than  the  others,  and  the 
scales  in  Figures  5  and  6  are  slightly  narrower.  Table  1  shows  the  iteration  counts  needed  to  satisfy 
the  stopping  criterion  of 


\\rjh 
1 1  r0 1 1 2 


<  10~6. 


A  maximum  of  100,  150,  and  200  iterations  were  permitted  for  the  CGN,  GMRES  and  PSUP  meth¬ 
ods,  respectively.  (For  these  iteration  counts,  CGN,  GMRES(20)  and  GMRES-PSUP(4)  performed 
roughly  the  same  number  of  operations.)  Our  main  observations  on  this  data  are: 

1.  Problems  1  and  3  are  solved  efficiently  by  nearly  all  the  methods,  but  for  the  other  four  problems 
convergence  is  slow. 

2.  In  general,  the  hybrid  GMRES-PSUP(m)  scheme  is  weakest.  The  plateaus  in  Figures  3  5 
and  6  for  this  method  correspond  to  the  PSUP  step,  for  which  convergence  is  very  slow. 
The  “reduction”  heuristic  improves  the  performance,  but  the  improvement  is  due  largely  to 
increased  effectiveness  of  the  GMRES  part  of  the  iteration  (e.g.  in  the  steep  drops  of  Figures 
2-4),  and  the  improved  performance  is  not  better  than  that  of  GMRES  alone. 

3.  On  the  whole,  GMRES(20)  and  CGN  are  the  most  effective  methods  for  these  problems,  but 
they  are  not  dramatically  superior  to  the  others.  GMRES(20)  converges  more  rapidly  than 
GMRES(5). 
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Residual  norm  Reslduol  norm 


Mult lpi loot  Ions 

-  CGN  D  -  GMRES-PSUP U) 

-  GMRES(S)  E  -  GMRES-PSUP 14.0.5) 

-  GMRES120)  F  -  GMRES-PSUP (4. 0.3) 

Figure  3:  Pi  =  1,  P2  =  2,  P3  =  80 


31  *  31  OI10.  P!=2S.  P2=50.  P3-0O 


1000000  2000000  3000000  4000000 

5000000 

Multiplication* 

n  -  CGN 

D  -  GMRES-PSUP (4) 

B  -  GMREStS) 

E  -  GMRES-PSUP (4. 0.51 

C  -  GMRESI20) 

F  -  GMRES-PSUP (4. 0.3! 

Figure  4:  Pi 

=  25,  P2  =  50,  P3  =  8C 

Excluding  storage  for  the  matrix  and  right  hand  side,  the  storage  requirements  for  the  methods 
considered  are 


CGN: 

4  N 

GMRES(5): 

7  N 

GMRES  (20): 

22  N 

All  PSUP  variants: 

10  N 

The  high  cost  of  the  PSUP  methods  is  due  to  the  eight  initializing  GMRES  steps. 


Problem  # 

1 

2 

3 

4 

5 

6 

CGN 

mm 

>100 

28 

>100 

>100 

GMRES(5) 

SUB 

>150 

46 

>150 

>150 

>150 

GMRES(20) 

10 

111 

17 

119 

>150 

GMRES-PSUP 

16 

>200 

199 

>200 

>200 

PSUP(4,.5) 

16 

62 

>200 

>200 

>200 

PSUP(4,.3) 

16 

>200 

70 

>200 

>200 

>200 

Table  1:  Iteration  counts. 


Although  the  GMRES/Reduced-PSUP  (PSUP(m,»/))  scheme  is  not  as  fast  as  pure  GMRES, 
the  reduction  heuristic  does  have  its  intended  effect  of  improving  upon  the  hybrid  scheme.  We 
briefly  examine  the  effect  of  the  heuristic  on  Problem  3,  focusing  on  two  curve  segments  of  Figure  3: 
the  plateau  of  curve  D  (GMRES-PSUP(4))  between  multiplication  counts  200000  and  300000,  and 
the  last  plateau  in  curve  E  (GMRES-PSUP(4,.5)).  For  curve  D,  on  return  from  the  adaptive  step 
at  about  multiplication  count  200000,  the  real  parts  of  the  eigenvalue  estimates  lie  in  the  intervals 
[-3, -.33]  and  [0.4, .98],  the  Chebyshev  norm  of  the  residual  polynomial  is  .98,  and  convergence  is 
slow.  For  curve  E,  on  return  from  the  adaptive  step  prior  to  the  last  plateau  of  the  curve,  the  real 
parts  of  the  eigenvalue  estimates  lie  in  the  intervals  [-3, -.56]  and  [.05, .97],  and  the  Chebyshev  norm 
is  .96.  The  effect  of  deletion  of  points  is  shown  in  Table  2.  The  Chebyshev  norm  is  very  large  when 
there  are  points  near  the  origin,  and  it  declines  as  these  points  are  deleted.  The  deletion  of  points 
does  not  significantly  hurt  the  PSUP  part  of  the  iteration  and  it  strongly  enhances  the  effect  of  the 
GMRES  steps. 

We  remark  that  we  also  considered  other  variants  of  the  PSUP  iteration.  In  experiments  with 
degrees  m  =  6  and  10  the  performance  of  PSUP  was  essentially  the  same.*  Moreover,  as  we  noted 

*  In  lome  teat*  with  degree  16,  we  were  unable  to  generate  the  polynomial  coefficients.  We  believe  the  choice  of  the  powers 
of  i  as  basis  functions  makes  (3.5)  ill  conditioned  for  large  m;  see  [19].  In  addition,  the  implementation  based  on  Homer’s  rule 
may  suffer  from  instability  for  large  m. 


Deleted 

Points 

Intervals  Containing 
Real  Parts 

Chebyshev 

Norm 

- 

[-3,  -.56],  [.05, .97] 

.96 

.05 

[-3,  -.56],  [.34, .97] 

.76 

.34 

[-3,  -.56],  [.61, .97] 

.55 

-.56 

[-3, -1.46],  [.61, .97] 

.33 

Table  2:  Effect  of  point  deletion  on  GMRES/Reduced-PSUP(4,.5) 
for  Problem  3. 

in  Section  3,  a  variant  of  the  GMRES/Reduced-PSUP  in  which  the  PSUP  polynomial  is  constrained 
to  be  bounded  in  modulus  by  one  on  the  set  of  deleted  eigenvalue  estimates  displayed  about  the 
same  behavior  as  the  unconstrained  version.  Similarly,  we  tested  LSQR  [16],  a  stabilized  version 
of  CGN,  and  found  that  its  performance  was  nearly  identical  to  CGN. 

5.  Conclusions 

The  GMRES  and  PSUP  methods  are  iterative  methods  that  are  optimal  in  the  class  of 
polynomial-based  methods  with  respect  to  the  Euclidean  or  to  norms  respectively,  for  arbitrary 
nonsingular  linear  systems.  For  linear  systems  in  which  the  coefficient  matrix  is  either  symmetric  or 
definite  (or  both),  these  types  of  methods  are  effective  solution  techniques  [3,  5].  In  particular,  they 
are  superior  to  solving  the  normal  equations  by  the  conjugate  gradient  method.  In  the  results  of 
Section  4,  the  methods  based  on  polynomials  in  the  coefficient  matrix  are  not  dramatically  superior 
to  CGN,  especially  for  systems  that  are  both  highly  nonsymmetric  and  highly  indefinite.  GMRES 
appears  to  be  a  more  effective  method  than  PSUP. 

We  note  that  the  best  results  for  other  classes  of  problems  depend  strongly  on  preconditioning. 
We  used  the  discrete  Laplacian  as  a  preconditioner  in  our  experiments,  and  the  large  iteration/work 
counts  in  the  results  show  that  this  is  not  a  good  choice  for  the  given  mesh  size  when  the  coefficients 
in  the  differential  operator  are  large.  We  believe  that  improvements  in  preconditioners  are  needed 
to  handle  this  class  of  problems. 
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